Major update that improves support for formulas specification #582

stefvanbuuren · 2023-09-11T06:46:30Z

reintroduces the square predictorMatrix
defines conversion functions p2f(), p2c(), f2p(), n2b(), b2n()
defines validate.blocks(), validate.predictorMatrix()
extends edit.setup() to formulas and blots
for reading ease, use ~ 1 for the empty predictor set instead of ~ 0
does not automatically set method = "" for variables that are not imputed (NOTE: DECISION REVERTED. SEE BELOW)
as far as possible, changes the leading argument to formulas (instead of blocks or predictorMatrix)
adds function typecodes() in sampler() to reduce multiple predictorMatrix lines to one (support for multivariate imputation methods)
implements new logic in sampler.univ()
comments out some tests that depend on hard-coded parameter estimates
sharpens test for equality between predictorMatrix and formulas specifications
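
To illustrate the kind of equivalence between the two specifications that the conversion functions rely on, here is a minimal base-R sketch. `pm_to_formulas()` is a hypothetical helper written for this example; it is not the `p2f()` from this PR, whose signature and edge-case handling may differ. It assumes a square predictor matrix with identical row and column names, and uses `~ 1` for the empty predictor set as described above.

```r
# Hypothetical helper (NOT mice's p2f()): turn each row of a square, named
# predictor matrix into a formula. Rows without predictors become `y ~ 1`.
pm_to_formulas <- function(pm) {
  stopifnot(
    is.matrix(pm),
    nrow(pm) == ncol(pm),
    identical(rownames(pm), colnames(pm))
  )
  out <- lapply(rownames(pm), function(y) {
    x <- colnames(pm)[pm[y, ] != 0]
    rhs <- if (length(x) > 0) paste(x, collapse = " + ") else "1"
    as.formula(paste(y, "~", rhs))
  })
  names(out) <- rownames(pm)
  out
}

# Toy predictor matrix: bmi is predicted from age and chl, chl from age
vars <- c("age", "bmi", "chl")
pm <- matrix(0, 3, 3, dimnames = list(vars, vars))
pm["bmi", c("age", "chl")] <- 1
pm["chl", "age"] <- 1

fs <- pm_to_formulas(pm)
print(fs)
```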


stefvanbuuren · 2023-09-11T07:00:04Z

Ideas for further development:

add new functions to YAML so that they appear on the site
soft replacement of blocks by nest (a character vector of length ncol(data) holding the block names; the default is colnames(data))
Provide a way for the user to see head of design matrix created in sampler.univ(). Add examples that exploit formulas to add interactions, nested variables, by-processing and other advanced models
Describe differences and equivalences between predictorMatrix and formulas specification
...

stefvanbuuren · 2023-09-11T21:43:47Z

In preparation to tweaking documentation, converts Rd tags to roxygen2 tags.
Adds new functions to YAML


stefvanbuuren · 2023-09-13T09:23:58Z

Commits 5c6bee2 and 755c23a generalise the classic behaviour of the predictorMatrix to blocks.

It works as follows:

mice() uses the nimp() function to calculate the number of imputations needed for a given block of variables;
if the number of needed imputations in block j is zero, the following happens:

mice() sets method[j] <- ""
mice() sets predictorMatrix[v, ] <- 0 for all variables v in block j
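
The rule above can be sketched in base R as follows. This is an illustration under assumptions, not the code from commits 5c6bee2 and 755c23a: the count of missing cells per block stands in for whatever `nimp()` returns, and the toy `data`, `blocks`, `method` and `pm` objects are made up for the example.

```r
# Toy data: age is complete, bmi and chl have missing values
data <- data.frame(
  age = c(1, 2, 3, 4),
  bmi = c(20, NA, 25, NA),
  chl = c(NA, 180, 190, 200)
)
blocks <- list(b1 = "age", b2 = c("bmi", "chl"))
method <- c(b1 = "pmm", b2 = "pmm")

vars <- colnames(data)
pm <- matrix(1, length(vars), length(vars), dimnames = list(vars, vars))
diag(pm) <- 0

# Assumed stand-in for nimp(): number of missing cells per block
n_needed <- vapply(
  blocks,
  function(v) sum(is.na(data[, v, drop = FALSE])),
  integer(1)
)

# For blocks that need no imputations: empty method, zero predictor rows
for (j in names(blocks)) {
  if (n_needed[[j]] == 0L) {
    method[j] <- ""
    pm[blocks[[j]], ] <- 0
  }
}

print(method)
print(pm)
```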

This PR also removes the error message "mice detected constant and/or collinear variables. No predictors were left after their removal." Imputations will be generated without predictors by the intercept-only imputation model (not recommended in general).

WARNING: Setting predictorMatrix[v, ] <- 0 does not prevent imputation of variable v. To prevent imputation of v, specify the appropriate entry of method as "".


stefvanbuuren · 2023-09-13T09:56:07Z

Commit c2da03c cleans up the internal function edit.setup(). It returns the proper formulas of the reduced model, but it is not quite right for meth, vis and post. Added FIXME.


stefvanbuuren · 2023-09-18T08:25:16Z

New behaviours

Prevention of NA propagation by removing incomplete predictors. This version detects when a predictor contains missing values that are not imputed. To prevent NA propagation, mice() takes the following actions: 1) removes the incomplete predictor(s) from the RHS, 2) adds the incomplete predictor(s) to formulas (var ~ 1) and block components, 3) sets method[var] = "", and 4) sets the corresponding predictorMatrix column and row to zero.
The predictorMatrix input can be a square submatrix of the full predictorMatrix. mice() will augment predictorMatrix to the full matrix and always return a p * p named matrix corresponding to the p columns in the data. The inactive variables will have zero columns and rows.
The predictorMatrix input may be unnamed if its size is p * p. For sizes other than p * p, an unnamed matrix generates an error.
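
A minimal sketch of the augmentation step, assuming the submatrix is square and carries identical row and column names that all occur in the data. `augment_pm()` is a hypothetical name used only for this illustration; the actual logic in mice() may differ.

```r
# Hypothetical helper: embed a named square submatrix into the full p * p
# predictor matrix. Variables not mentioned get zero rows and zero columns.
augment_pm <- function(small, data) {
  vars <- colnames(data)
  stopifnot(
    is.matrix(small),
    !is.null(rownames(small)),
    identical(rownames(small), colnames(small)),
    all(rownames(small) %in% vars)
  )
  full <- matrix(0, length(vars), length(vars), dimnames = list(vars, vars))
  full[rownames(small), colnames(small)] <- small
  full
}

data <- data.frame(age = 1:3, bmi = c(20, NA, 25), chl = c(NA, 180, 190), hyp = 0)
small <- matrix(c(0, 1, 1, 0), 2, 2,
                dimnames = list(c("bmi", "chl"), c("bmi", "chl")))

full <- augment_pm(small, data)
print(full)
```

An unnamed 2 * 2 matrix would fail the `stopifnot()` check here, which mirrors the rule that only a p * p input may be unnamed.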

Changes

Adds support for a tiny predictorMatrix
Solves bug in f2p()
Adds new function remove.rhs.variables()
Adds a validate.mids() check at exit that errors if rownames(predictorMatrix) differ from colnames(data). Some more output tests need to be added.
Removes code designed to work specifically with a non-square predictorMatrix
Generates an error if predictorMatrix has fewer rows than length of blocks

stefvanbuuren · 2023-09-18T08:53:57Z

Exit checks added:

rownames(predictorMatrix) must match colnames(data)
length of formulas and blocks must be equal
length of formulas and method must be equal
length of method vector cannot exceed number of variables
length of imp and number of variables must be equal
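
The five checks listed above could look roughly like this. The sketch operates on a plain list with assumed element names (`data`, `predictorMatrix`, `formulas`, `blocks`, `method`, `imp`); it is not the `validate.mids()` from this PR, and `exit_problems()` is a hypothetical name.

```r
# Hypothetical stand-in for the exit validation. `obj` is a plain list here,
# not a real mids object. Returns the descriptions of all failed checks.
exit_problems <- function(obj) {
  p <- ncol(obj$data)
  checks <- c(
    "rownames(predictorMatrix) must match colnames(data)" =
      identical(rownames(obj$predictorMatrix), colnames(obj$data)),
    "length of formulas and blocks must be equal" =
      length(obj$formulas) == length(obj$blocks),
    "length of formulas and method must be equal" =
      length(obj$formulas) == length(obj$method),
    "length of method vector cannot exceed number of variables" =
      length(obj$method) <= p,
    "length of imp and number of variables must be equal" =
      length(obj$imp) == p
  )
  names(checks)[!checks]
}

vars <- c("age", "bmi", "chl")
obj <- list(
  data = data.frame(age = 1:3, bmi = c(20, NA, 25), chl = c(NA, 180, 190)),
  predictorMatrix = matrix(0, 3, 3, dimnames = list(vars, vars)),
  formulas = list(age = age ~ 1, bmi = bmi ~ age, chl = chl ~ age),
  blocks = list(age = "age", bmi = "bmi", chl = "chl"),
  method = c(age = "", bmi = "pmm", chl = "pmm"),
  imp = list(age = NULL, bmi = NULL, chl = NULL)
)

ok <- exit_problems(obj)

# Break one invariant: reorder the row names of the predictor matrix
bad <- obj
rownames(bad$predictorMatrix) <- rev(vars)
problems <- exit_problems(bad)
print(problems)
```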


stefvanbuuren · 2023-09-22T09:55:11Z

New behaviours and features thus far

TWO SEPARATE INTERFACES FOR MODEL SPECIFICATION: This version promotes two interfaces to specify imputation models: predictor (predictorMatrix + parcel + method) and formula (formulas + method). This version no longer accepts mixes of predictorMatrix and formulas arguments in the call to mice().
NA-PROPAGATION PREVENTION. This version detects when a predictor contains missing values that are not imputed. In order to prevent NA propagation, mice() can follow two strategies: "Autoremove" (remove incomplete predictor(s) from the RHS, set method to "", adapt predictorMatrix, formulas and blocks, write to loggedEvents), or "Autoimpute" (Impute incomplete predictor and adapt method, predictorMatrix, formulas, and so on). "Autoremove" is implemented and current default. Use mice(..., autoremove = FALSE) to revert to old behavior (NA propagation).
SUBMODELS: The predictorMatrix input can be a square submatrix of the full predictorMatrix when its dimensions are named. mice() will augment the tiny predictorMatrix to the full matrix and always return a p * p named matrix corresponding to the p columns in the data. Unmentioned variables are not imputed, and the predictorMatrix, formulas and method are adapted accordingly.
DROP NON-SQUARE PREDICTOR MATRIX: Version 3.0 introduced non-square versions, but its interpretation turned out to be complex and ambiguous. For clarity, this update works with a predictor matrix that is square with both dimensions identically named with the names of the variables in the data. Variable groups are now specified through the parcel argument.
NEW PARCEL ARGUMENT. There is a new parcel argument that is easier to use. The print of the mids object shows parcel when it is different from the default. parcel can take over the role of blocks in specification. blocks is soft-deprecated, but still widely used within the program code.
NEW DOTS ARGUMENT. The blots argument is renamed to dots.
EXIT VALIDATION: Adds a new validate.mids() function that checks the mids object before exit.
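
As a rough illustration of the "Autoremove" strategy, the following base-R sketch finds unimputed incomplete predictors and removes them from a toy specification. It assumes a variable-level `method` vector and a square named predictor matrix; the variable names and the layout of the `logged` data frame are made up and do not reproduce the loggedEvents format of mice.

```r
# Toy setup: only bmi is imputed, but it uses age, which has missing values
data <- data.frame(
  age = c(1, NA, 3),
  bmi = c(20, NA, 25),
  chl = c(180, 190, 200)
)
vars <- colnames(data)
pm <- matrix(0, 3, 3, dimnames = list(vars, vars))
pm["bmi", c("age", "chl")] <- 1
method <- c(age = "", bmi = "pmm", chl = "")

imputed    <- names(method)[method != ""]
incomplete <- vars[colSums(is.na(data)) > 0]
used       <- vars[colSums(pm != 0) > 0]

# uip: unimputed incomplete predictors, the source of NA propagation
uip <- setdiff(intersect(incomplete, used), imputed)

# "Autoremove": drop these variables as predictors and keep them unimputed
pm[, uip] <- 0
pm[uip, ] <- 0
method[uip] <- ""

# One log entry per removed variable (illustrative layout only)
logged <- data.frame(
  variable = uip,
  action = rep("removed incomplete predictor", length(uip))
)
print(pm)
print(logged)
```

The "Autoimpute" alternative would instead assign an imputation method to each variable in `uip` and give it a non-empty predictor row.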


stefvanbuuren · 2023-10-02T20:58:29Z

Three proposed changes to new behaviour

NA-PROPAGATION. It is better to use NA-PROPAGATION by default. The reason is that the user becomes aware of a potential model specification problem (e.g. not imputing a variable used as a predictor). mice() should offer two easy ways to solve the problem: "autoremove" and "autoimpute". We prefer the NA-PROPAGATION default because it alerts the user, whereas the other two options would "magically" make the problem disappear (and thereby downgrade model specification hygiene).
The formula of a complete variable is now something like age ~ 1. It is better to use age ~ 0, to signal that for the dependent not even the intercept-only model is used.
The formulas argument is returned with an environment attached to each formula. This environment does not seem to be necessary in mice(), so it is cleaner to remove the environment.
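
For the third proposal, a small base-R sketch shows what "an environment attached to each formula" means and one way to strip it. `make_formula()` is a made-up function for the illustration; whether mice() can drop the environment without side effects is exactly the open question raised above.

```r
# A formula created inside a function keeps a reference to that function's
# local environment, including any large objects that live there.
make_formula <- function() {
  big <- numeric(1e5)  # stays reachable through the formula's environment
  bmi ~ age + chl
}

f <- make_formula()
has_big <- exists("big", envir = environment(f), inherits = FALSE)

# Strip the environment; the object is still a formula with the same terms
f2 <- f
attr(f2, ".Environment") <- NULL

print(has_big)
print(is.null(environment(f2)))
print(all.vars(f2))
```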

stephematician · 2024-04-13T04:58:39Z

R/predictorMatrix.R

@@ -82,9 +83,23 @@ check.predictorMatrix <- function(predictorMatrix,
    )
  }

-  # calculate ynames (variables to impute) for use in check.method()
+  # NA-propagation prevention
+  # find all dependent (imputed) variables
  hit <- apply(predictorMatrix, 1, function(x) any(x != 0))


Can be simplified to: apply(predictorMatrix != 0, 1, any)
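
A quick check of the suggested simplification on a toy matrix (the matrix is made up for the example). `rowSums()` is shown as a further vectorised alternative that is not part of the review comment.

```r
vars <- c("age", "bmi", "chl")
pm <- matrix(0, 3, 3, dimnames = list(vars, vars))
pm["bmi", c("age", "chl")] <- 1

hit_original   <- apply(pm, 1, function(x) any(x != 0))
hit_simplified <- apply(pm != 0, 1, any)
hit_rowsums    <- rowSums(pm != 0) > 0

print(hit_original)
print(identical(hit_original, hit_simplified))
```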

stephematician · 2024-04-13T04:59:28Z

R/predictorMatrix.R

+  # find all variables in data that are not imputed
+  notimputed <- setdiff(colnames(data), ynames)
+  # select uip: unimputed incomplete predictors
+  completevars <- colnames(data)[!apply(is.na(data), 2, sum)]


!apply(is.na(data), 2, any) might be more efficient
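
The two forms select the same columns, as this toy comparison shows. The `colSums()` variant is an additional option that is not in the review comment; no timing claims are made here.

```r
data <- data.frame(
  age = c(1, 2, 3),
  bmi = c(20, NA, 25),
  chl = c(NA, NA, 190)
)

# Original: `!` coerces the per-column NA counts to logical (0 -> TRUE)
complete_sum <- colnames(data)[!apply(is.na(data), 2, sum)]
# Suggested: states the intent directly
complete_any <- colnames(data)[!apply(is.na(data), 2, any)]
# Vectorised alternative without apply()
complete_cs  <- colnames(data)[colSums(is.na(data)) == 0]

print(complete_any)
```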


stephematician · 2024-06-17T06:24:05Z

R/blocks.R

@@ -157,6 +156,16 @@ check.blocks <- function(blocks, data, calltype = "pred") {
    ))
  }

+  # save ynames (variables to impute) for use in check.method()
+  ynames <- unique(as.vector(unname(unlist(blocks))))


Is as.vector redundant for the return value from unlist?
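
A small experiment bearing on this question, using made-up block lists. For character blocks the result is the same with or without `as.vector()`. The one case where it changes the outcome is when every element is a factor, because `unlist()` then returns a factor; whether blocks can ever hold factors in mice is not established here.

```r
blocks <- list(b1 = "age", b2 = c("bmi", "chl"), b3 = "age")

with_as_vector    <- unique(as.vector(unname(unlist(blocks))))
without_as_vector <- unique(unname(unlist(blocks)))
same_for_character <- identical(with_as_vector, without_as_vector)

# Edge case: a list of factors unlists to a factor, and as.vector()
# is what turns it into a character vector
fblocks <- list(b1 = factor("age"), b2 = factor(c("bmi", "chl")))
factor_result <- unname(unlist(fblocks))

print(same_for_character)
print(class(factor_result))
print(class(as.vector(factor_result)))
```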


Convert documentation Rd tags to markdown tags for roxygen2

ea84be3

stefvanbuuren added 2 commits September 12, 2023 15:54

Add a data argument to nimp() to calculate number of imputations per …

5c6bee2

…block. Update make.method() so that homogeneous types and nlevels within a block get an appropriate default method.

Restore classic predictorMatrix behaviour that sets predictorMatrix[j…

755c23a

…, ] to zero when variable j is member of a block for which no imputations are needed.

Clean up source, identicate that there is still a problem with edit.s…

c2da03c

…etup()

stefvanbuuren linked an issue Sep 13, 2023 that may be closed by this pull request

How should mice behave when variables are not specified in the model #583

Open

stefvanbuuren added 3 commits September 13, 2023 22:23

Create a make.nest(), n2b() and b2n() function for working with nest …

28821a6

…argument

Insist that predictorMatrix has a zero diagonal

731bf25

- Prevention of NA propagation

8f92307

- Stricter controls on input predictorMatrix - Output test of mids object

Add exit checks on mids object

772c876

stefvanbuuren added 14 commits September 18, 2023 14:57

Add test for zero predictorMatrix row if method == "", deal with rela…

465bd5c

…ted issues in check.blocks(), make.method(), edit.predictorMatrix()

Update news

c8ed335

Update documentation for mice() arguments

05a0209

Update list of builtin imputation methods

6033fc6

Reorder sequence of mice() arguments

29fee22

Reorder nest in data sequence

fef881b

Use lowercase 'b' and 'f' for automatic naming of blocks and formulas

ba383eb

Update error message in mpmm

4175534

Sort terms both for pred and formulas

0166992

Create a mechanism to inform check.method() of the set of variables t…

35b6084

…o impute (ynames)

Introduce NA types in initialize.imp()

65f544f

Update nest printing in print.mids()

d9c6fa6

Add support for blots to multivariate imputation models

b9e398e

Rename nest to parcel

0345ec3

stefvanbuuren added 8 commits September 21, 2023 16:20

Use lower case default block names

07a79e9

Rename blots to dots

53916f4

Rename files from blots/nest to dots/parcel

3c09055

Add deprecation support for make.blots()

3cebc30

Implement autoremove in check.predictorMatrix() and check.formulas()

7b7a17c

Write one loggedEvent for each removed variable

8c4bb38

Abort mice when user speficies mixes of formulas and `predictorMatr…

24688b1

…ix` arguments

Update NEWS.md

e1c475f

stefvanbuuren added 4 commits October 2, 2023 21:30

Reorder mice() arguments into a clusters of operations

da6396b

Remove superfluous construct.parcel(), make remove.rhs.variables() in…

db5caf6

…ternal

Add MICE 4 Syntax Documentation CONCEPT as a vignette

f5d5c99

Rebuild site to include article mice4syntax

6edcd71

stephematician reviewed Apr 13, 2024


stefvanbuuren added 3 commits April 17, 2024 22:42

Add test for character variable (#601)

232a0b6

Merge main and support_blocks into new branch mice4 (still failing so…

09e58ea

…me tests)

Merging update

15321b4

Merge branch 'master' into mice4 # Conflicts: # DESCRIPTION # NEWS.md # R/ampute.continuous.R # R/edit.setup.R # R/futuremice.R # R/mice.R # R/mira.R # R/parlmice.R # R/predictorMatrix.R

stephematician reviewed Jun 17, 2024


Update support_blocks with master

deac372

Merge branch 'master' into support_blocks # Conflicts: # DESCRIPTION # NEWS.md # R/ampute.R # R/method.R # R/mice.R # R/mice.impute.rf.R # man/ampute.Rd # man/mice.Rd # man/mice.impute.rf.Rd # tests/testthat/test-parlmice.R
